164 research outputs found

    Linearity Testing Against a Fuzzy Rule-based Model

    In this paper, we introduce a linearity test for fuzzy rule-based models in the framework of time series modeling. To do so, we explore a family of statistical models, the regime-switching autoregressive models, and the relations that link them to fuzzy rule-based models. From these relations, we derive a Lagrange Multiplier linearity test and some properties of the maximum likelihood estimator needed for it. Finally, an empirical study of the goodness of the test is presented.
    Keywords: fuzzy rule-based models, time series, linearity test, statistical inference

    Testing for Heteroskedasticity of the Residuals in Fuzzy Rule-Based Models

    In this paper, we propose a new diagnostic checking tool for fuzzy rule-based modelling of time series. By studying the residuals in the Lagrange Multiplier testing framework, we devise a hypothesis test that determines whether the residual time series is homoscedastic, that is, whether it has the same variance throughout time. This is another important step towards a statistically sound modelling strategy for fuzzy rule-based models.
    Funding: Spanish Ministerio de Ciencia e Innovación (MICINN), project grants TIN2009-14575 and CIT-460000-2009-4.
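    To make the idea of a Lagrange Multiplier test on residuals concrete, here is a minimal sketch of the classical ARCH-LM statistic with one lag: regress the squared residuals on their lagged values and compute n times R squared, which is compared against a chi-square quantile with one degree of freedom. This is a standard textbook variant swapped in for illustration, not the test derived in the paper; the function name `lm_heteroskedasticity_stat` and the simulated series are assumptions.

```python
import random


def lm_heteroskedasticity_stat(residuals, lag=1):
    """Return n * R^2 from regressing e_t^2 on e_{t-lag}^2 (with intercept)."""
    if lag < 1 or len(residuals) <= lag + 1:
        raise ValueError("need at least lag + 2 residuals and lag >= 1")
    sq = [e * e for e in residuals]
    y = sq[lag:]       # squared residuals at time t
    x = sq[:-lag]      # squared residuals at time t - lag
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0     # constant squared residuals: no evidence against H0
    r_squared = (sxy * sxy) / (sxx * syy)
    return n * r_squared


# 95% quantile of the chi-square distribution with 1 degree of freedom.
CHI2_1_CRIT_5PCT = 3.841

rng = random.Random(0)

# Hypothetical homoscedastic residuals: i.i.d. Gaussian noise.
iid_residuals = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Hypothetical heteroscedastic residuals: ARCH(1) with time-varying variance.
arch_residuals = []
prev = 0.0
for _ in range(2000):
    sigma2 = 0.5 + 0.5 * prev * prev
    prev = rng.gauss(0.0, 1.0) * sigma2 ** 0.5
    arch_residuals.append(prev)

stat_iid = lm_heteroskedasticity_stat(iid_residuals)
stat_arch = lm_heteroskedasticity_stat(arch_residuals)
```

    A statistic above `CHI2_1_CRIT_5PCT` rejects homoscedasticity at the 5% level; under the null hypothesis roughly one series in twenty will exceed it by chance.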

    Semantics of Data Mining Services in Cloud Computing

    In recent years, with the rise of Cloud Computing (CC), many companies providing services in the cloud are adding a new series of services to their catalogues, such as data mining (DM) and data processing (DP), taking advantage of the vast computing resources available to them. Different service definition proposals have been put forward to address the problem of describing services in CC in a comprehensive way. Since each provider has its own definition of the logic of its services, and specifically of DM services, the ability to describe services flexibly across providers is fundamental to maintaining the usability and portability of this type of CC service. The use of semantic technologies based on the Linked Data (LD) proposal for the definition of services allows the design and modelling of DM services with a high degree of interoperability. In this article, a schema for the definition of DM services on CC is presented that considers all key aspects of a service in CC, such as prices, interfaces, Service Level Agreement (SLA), instances, or DM workflow, among others. The new schema is based on LD, and it reuses other schemata to obtain a better and more complete definition of the services. To validate the completeness of the schema, a series of DM services have been created in which a set of algorithms such as Random Forest (RF) or KMeans are modeled as services. In addition, a dataset has been generated that includes the definition of the services of several actual CC DM providers, confirming the effectiveness of the schema.
    Funding: M. Parra-Royon holds an "Excelencia" scholarship from the Regional Government of Andalucía (Spain). This work was supported by the Research Projects P12-TIC-2958 and TIN2016-81113-R (Ministry of Economy, Industry and Competitiveness - Government of Spain).

    Fuzzy Systems-as-a-Service in Cloud Computing

    Fuzzy systems have become widely accepted and applied in a host of domains such as control, electronics or mechanics. The software for constructing these systems has traditionally been provided by tools, platforms and languages run on on-premise computing infrastructure. On the other hand, the rise and ubiquity of the cloud computing model has brought a revolutionary way of deploying computing services. The boom in cloud services is leading towards increasingly specific service offerings, such as data mining and machine learning services. Unfortunately, so far, no definition of a fuzzy system as a service is available. This paper identifies this opportunity and focuses on developing a proposal for a fuzzy system-as-a-service definition. To achieve this, the proposal pursues three objectives: the complete description of cloud services for fuzzy systems using semantic technology, the composition of services, and the exploitation of the model in cloud platforms for integration with other services. As an illustrative case, a real-world problem is addressed with the proposed specification.
    Funding: This work was supported by the Research Projects P12-TIC-2958 and TIN2016-81113-R (Ministry of Economy, Industry and Competitiveness - Government of Spain).

    Neural Networks in R Using the Stuttgart Neural Network Simulator: RSNNS

    Neural networks are important standard machine learning procedures for classification and regression. We describe the R package RSNNS that provides a convenient interface to the popular Stuttgart Neural Network Simulator SNNS. The main features are (a) encapsulation of the relevant SNNS parts in a C++ class, for sequential and parallel usage of different networks, (b) accessibility of all of the SNNS algorithmic functionality from R using a low-level interface, and (c) a high-level interface for convenient, R-style usage of many standard neural network procedures. The package also includes functions for visualization and analysis of the models and the training procedures, as well as functions for data input/output from/to the original SNNS file formats.
    Funding: This work was supported in part by the Spanish Ministry of Science and Innovation (MICINN) under Project TIN-2009-14575. C. Bergmeir holds a scholarship from the Spanish Ministry of Education (MEC) of the "Programa de Formación del Profesorado Universitario (FPU)".

    Overall quality optimization for DQM stage in High Energy Physics experiments

    Data Acquisition (DAQ) and Data Quality Monitoring (DQM) are key parts of the HEP data chain, where the data are processed and analyzed to obtain accurate monitoring quality indicators. These stages are complex: they include an intense processing workflow and require a high degree of interoperability between software and hardware facilities. Data recorded by DAQ sensors and devices are sampled to perform live (and offline) DQM of the status of the detector during data collection, giving the system and scientists the ability to identify problems with extremely low latency and minimizing the amount of data that would otherwise be unsuitable for physical analysis. The DQM stage performs a large set of operations (Fast Fourier Transform (FFT), clustering, classification algorithms, Region of Interest, particle tracking, etc.) whose use of computing resources and time depends on the number of events of the experiment, the sampling data, the complexity of the tasks, or the quality performance. The objective of our work is to present a proposal aimed at developing a general optimization of the DQM stage that considers all these elements. Techniques based on computational intelligence, like EA, can help improve the performance and therefore achieve an optimization of task scheduling in DQM.
    Funding: (MINECO - Gov. of Spain) P12-TIC-2958 TIN2016-81113-

    SCMFTS: Scalable and Distributed Complexity Measures and Features for Univariate and Multivariate Time Series in Big Data Environments

    Time series data are becoming increasingly important due to the interconnectedness of the world. Classical problems, which keep growing in size, require more and more resources for their processing, and Big Data technologies offer many solutions. Although the principal algorithms for traditional vector-based problems are available in Big Data environments, the lack of tools for time series processing in these environments needs to be addressed. In this work, we propose a scalable and distributed time series transformation for Big Data environments based on well-known time series features (SCMFTS), which allows practitioners to apply traditional vector-based algorithms to time series problems. The proposed transformation, along with the algorithms available in Spark, improved on the best state-of-the-art results on the Wearable Stress and Affect Detection dataset, which is the biggest publicly available multivariate time series dataset in the University of California Irvine (UCI) Machine Learning Repository. In addition, SCMFTS showed a linear relationship between its runtime and the number of processed time series, demonstrating linearly scalable behavior, which is mandatory in Big Data environments. SCMFTS has been implemented in the Scala programming language for the Apache Spark framework, and the code is publicly available.
    Funding: This research has been partially funded by the following grants: TIN2016-81113-R from the Spanish Ministry of Economy and Competitiveness, P12-TIC-2985 and P18-TP-5168 from Andalusian Regional Government, Spain, and EU Commission with FEDER funds. Francisco J. Baldan holds the FPI grant BES-2017-080137 from the Spanish Ministry of Economy and Competitiveness. D. Peralta is a Postdoctoral Fellow of the Research Foundation of Flanders (170303/12X1619N). Y. Saeys is an ISAC Marylou Ingram Scholar.
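    The core idea of a feature-based transformation, mapping each time series to a fixed-length vector so that ordinary vector-based algorithms can be applied, can be sketched as follows. This is a single-machine illustration in plain Python, not the Scala/Spark implementation; the four features chosen here and the function names `extract_features` and `transform` are assumptions and do not reproduce the SCMFTS feature set.

```python
from statistics import fmean, pstdev


def lag1_autocorr(series):
    """Lag-1 autocorrelation; 0.0 for a constant series."""
    m = fmean(series)
    denom = sum((v - m) ** 2 for v in series)
    if denom == 0:
        return 0.0
    num = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    return num / denom


def mean_abs_change(series):
    """Average absolute difference between consecutive values."""
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    return fmean(diffs) if diffs else 0.0


def extract_features(series):
    """Map one time series (length >= 1) to a fixed-length feature vector."""
    return [
        fmean(series),
        pstdev(series),
        lag1_autocorr(series),
        mean_abs_change(series),
    ]


def transform(dataset):
    """Turn a list of series (possibly of different lengths) into a feature table."""
    return [extract_features(series) for series in dataset]
```

    Because each series is transformed independently of the others, the per-series step is the kind of operation that a distributed framework can apply in parallel, which is consistent with the linear runtime behavior reported above.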

    Complexity Measures and Features for Times Series classification

    Classification of time series is a growing problem in different disciplines due to the progressive digitalization of the world. Currently, the state of the art in time series classification is dominated by The Hierarchical Vote Collective of Transformation-based Ensembles. This algorithm is composed of several classifiers from different domains distributed in five large modules. Combining the results obtained by each module, weighted according to an internal evaluation process, allows this algorithm to obtain the best results in the state of the art. One Nearest Neighbour with Dynamic Time Warping remains the base classifier in any time series classification problem for its simplicity and good results. Despite their performance, both share a weakness: they are not interpretable. In the field of time series classification, there is a tradeoff between accuracy and interpretability. In this work, we propose a set of characteristics capable of extracting information on the structure of the time series to tackle time series classification problems. These characteristics make it possible to use traditional classification algorithms in time series problems. The experimental results of our proposal show no statistically significant differences from the second and third best models of the state of the art. Apart from competitive accuracy, our proposal is able to offer interpretable results based on the set of characteristics proposed.
    Funding: Spanish Government TIN2016-81113-R PID2020-118224RB-I00 BES-2017-080137; Andalusian Regional Government, Spain P12-TIC-2958 P18-TP-5168 A-TIC-388-UGR-1
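    For readers unfamiliar with the baseline mentioned above, One Nearest Neighbour with Dynamic Time Warping can be sketched in a few lines. This is a minimal, unconstrained DTW with an absolute-difference local cost, written for illustration only; the names `dtw_distance` and `predict_1nn` and the toy training set are assumptions, and practical implementations usually add a warping window and lower bounds for speed.

```python
def dtw_distance(a, b):
    """Unconstrained DTW distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again with an earlier b
                cost[i][j - 1],      # b[j-1] matched again with an earlier a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def predict_1nn(train, query):
    """Return the label of the training series closest to `query` under DTW.

    `train` is a list of (series, label) pairs.
    """
    nearest = min(train, key=lambda pair: dtw_distance(pair[0], query))
    return nearest[1]


# Hypothetical toy training set with two classes.
toy_train = [
    ([0, 1, 2, 3], "up"),
    ([3, 2, 1, 0], "down"),
]
```

    The classifier returns only the nearest neighbour's label, which illustrates the interpretability concern raised in the abstract: the prediction is tied to a whole-series distance rather than to named characteristics of the series.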

    Ability to Predict Side-Out Performance by the Setter’s Action Range with First Tempo Availability in Top European Male and Female Teams

    The aims of this study were to compare the Setter's action range with availability of first tempo (SARA) between male and female volleyball, and to determine the relationship between several spatial and offensive variables and their influence on the success of the side-out in male and female volleyball. A total of 1302 side-outs (639 male, 663 female) were registered (2019 European Championship). The ranking, reception efficacy, position and trajectory of the setter between reception and set, first tempo availability, side-out result, rotation, and attack lane were analyzed through Recursive Partitioning for classification, regression and survival tree models and classification and regression trees algorithms. Our results show female teams with smaller SARAs than male teams, meaning female setters tend to play closer to the net. The correlation between the ranking and the distance from the average position of the setter to the ideal setting zone was not significant. A movement of the setter of 30 or less and more than 1 m in distance might improve the performance of the side-out. Depending on the spatial usage of the setter, some rotations might be more successful than others. When assessing performance, teams should consider the ability to play quick attacks when their reception is not as precise as they would expect.
    Funding: German Research Foundation (DFG) FPU14/02234; Spanish Ministry of Economy and Competitiveness - Spanish Ministry of Economy, Industry and Competitivity DEP2011-27503 TIN2016-81113-R; FEDER-Junta de Andalucia, Consejeria de Economia y Conocimiento TIC.388.UGR1

    Memetic Algorithms with Local Search Chains in R: The Rmalschains Package

    Global optimization is an important field of research both in mathematics and computer sciences. It has applications in nearly all fields of modern science and engineering. Memetic algorithms are powerful problem solvers in the domain of continuous optimization, as they offer a trade-off between exploration of the search space using an evolutionary algorithm scheme, and focused exploitation of promising regions with a local search algorithm. In particular, we describe the memetic algorithms with local search chains (MA-LS-Chains) paradigm, and the R package Rmalschains, which implements them. MA-LS-Chains has proven to be effective compared to other algorithms, especially in high-dimensional problem solving. In an experimental study, we demonstrate the advantages of using Rmalschains for high-dimension optimization problems in comparison to other optimization methods already available in R.
    Funding: This work was supported in part by the Spanish Ministry of Science and Innovation (MICINN) under Project TIN-2009-14575. The work was performed while C. Bergmeir held a scholarship from the Spanish Ministry of Education (MEC) of the "Programa de Formación del Profesorado Universitario (FPU)".
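    The exploration/exploitation trade-off described above can be illustrated with a very small memetic loop: an evolutionary step proposes a new candidate, and a local search refines it before it competes for a place in the population. This is a generic sketch in Python, not MA-LS-Chains and not the Rmalschains API; the blend crossover, Gaussian hill climbing, parameter values and the names `memetic_minimize`, `local_search` and `sphere` are all assumptions made for illustration.

```python
import random


def sphere(x):
    """Simple test objective with its minimum (0) at the origin."""
    return sum(v * v for v in x)


def clip(value, lo, hi):
    return min(hi, max(lo, value))


def local_search(f, x, lo, hi, step, iters, rng):
    """Gaussian hill climbing: keep a perturbed point only if it improves f."""
    best, best_val = list(x), f(x)
    for _ in range(iters):
        cand = [clip(v + rng.gauss(0.0, step), lo, hi) for v in best]
        val = f(cand)
        if val < best_val:
            best, best_val = cand, val
    return best, best_val


def memetic_minimize(f, dim, bounds, pop_size=20, generations=200, seed=0):
    """Steady-state evolutionary loop with local refinement of each child."""
    rng = random.Random(seed)
    lo, hi = bounds
    width = hi - lo
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=f)
        # Exploration: blend two of the better individuals and mutate.
        p1, p2 = rng.sample(pop[: pop_size // 2], 2)
        child = [
            clip((a + b) / 2 + rng.gauss(0.0, 0.05 * width), lo, hi)
            for a, b in zip(p1, p2)
        ]
        # Exploitation: refine the child with a short local search.
        child, child_val = local_search(
            f, child, lo, hi, step=0.02 * width, iters=30, rng=rng
        )
        # Steady-state replacement: the child replaces the worst if better.
        if child_val < f(pop[-1]):
            pop[-1] = child
    best = min(pop, key=f)
    return best, f(best)
```

    A call such as `memetic_minimize(sphere, dim=3, bounds=(-5.0, 5.0))` returns the best point found and its objective value. Unlike this sketch, the local search chains paradigm named in the abstract is concerned with how local search effort is distributed across individuals over time, which is not modeled here.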